25 research outputs found

    A context-aware tourism recommendation system

    Knowledge extraction and popularity modeling using social media

    Semantics-driven event clustering in Twitter feeds

    Detecting events using social media such as Twitter has many useful applications in real-life situations. Many algorithms, each using different information sources (textual, temporal, geographic or community features), have been developed to achieve this task. Semantic information is often added at the end of the event detection to classify events into semantic topics. But semantic information can also be used to drive the actual event detection, which is less covered by academic research. We therefore supplemented an existing baseline event clustering algorithm with semantic information about the tweets in order to improve its performance. This paper lays out the details of the semantics-driven event clustering algorithms developed, discusses a novel method to aid in the creation of a ground truth for event detection purposes, and analyses how well the algorithms improve over baseline. We find that assigning semantic information to every individual tweet results in a worse F1 measure than the baseline. If, however, semantics are assigned on a coarser, hashtag level, the improvement over baseline is substantial and significant in both precision and recall.

    Modeling and predicting the popularity of online news based on temporal and content-related features

    As the market of globally available online news is large and still growing, there is strong competition between online publishers to reach the largest possible audience. Therefore an intelligent online publishing strategy is of the highest importance to publishers. A prerequisite for being able to optimize any online strategy is to have trustworthy predictions of how popular new online content may become. This paper presents a novel methodology to model and predict the popularity of online news. We first introduce a new strategy and mathematical model to capture view patterns of online news. After a thorough analysis of such view patterns, we show that well-chosen base functions lead to suitable models, and show how the influence of day versus night on the total view patterns can be taken into account to further increase the accuracy, without leading to more complex models. Second, we turn to the prediction of future popularity, given recently published content. By means of a new real-world dataset, we show that the combination of features related to content, meta-data, and the temporal behavior leads to significantly improved predictions, compared to existing approaches which only consider features based on the historical popularity of the considered articles. Whereas traditionally linear regression is used for the application under study, we show that the more expressive gradient tree boosting method proves beneficial for predicting news popularity.

    Representation learning for very short texts using weighted word embedding aggregation

    Short text messages such as tweets are very noisy and sparse in their use of vocabulary. Traditional textual representations, such as tf-idf, have difficulty grasping the semantic meaning of such texts, which is important in applications such as event detection, opinion mining, and news recommendation. We constructed a method based on semantic word embeddings and frequency information to arrive at low-dimensional representations for short texts designed to capture semantic similarity. For this purpose we designed a weight-based model and a learning procedure based on a novel median-based loss function. This paper discusses the details of our model and the optimization methods, together with the experimental results on both Wikipedia and Twitter data. We find that our method outperforms the baseline approaches in the experiments, and that it generalizes well to different word embeddings without retraining. Our method is therefore capable of retaining most of the semantic information in the text, and is applicable out-of-the-box.
    Comment: 8 pages, 3 figures, 2 tables; appears in Pattern Recognition Letters

    Using social media to find places of interest: a case study

    In this paper, we show how the large amount of geographically annotated data in social media can be used to complement existing place databases. After explaining our method, we illustrate how this approach can be used to discover new instances of a given semantic type, using London as a case study. In particular, for several place types, our method finds places in London that are not yet contained in the databases used by Foursquare, Google, LinkedGeoData and Geonames. Encouraged by these results, we briefly sketch how similar techniques could be used to identify likely errors in existing databases, to estimate the spatial extent of places, to discover semantic relationships between place types, and to recommend tags to users who are uploading photos.

    Detecting places of interest using social media
